logo of company

From Norms to Neuropsychopathology: Exploring Neuroanatomical & Neuropsychiatric Variation in Alzheimer’s Disease Through Normative Modeling


This study extends analyses conducted by Verdi et al. (2023), who utilized normative modeling techniques to delineate neuroanatomical heterogeneity in Alzheimer’s disease. We employ a similar methodological framework to supplement these insights by integrating both structural MRI data and neuropsychiatric symptom profiles. This integration allowed us to explore more deeply the neuroanatomical and neuropsychiatric underpinnings of Alzheimer’s disease across different phenotypes within our ADNI subset. This study was conducted by the Applied Neuroimaging Statistics Ressearch Laboratory at McLean Hospital & Harvard Medical School.

Author: Adrián Medina

Date: October 3, 2024

Project Overview


Analytic Background:
Our study data were a subset derived from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) ADNI3 wave data bank. This was due to the harmonization of scanner sequence protocols across data collection sites that began during this wave. We only included subjects that had both structural MRI and PET (amyloid) data collected, followed by QA of these data. This analysis leveraged a normative model developed by the Predictive Clinical Neuroscience Group at the Donders Institute and Radboud UMC, which aims to predict and stratify brain disorders on the basis of neuroimaging data. Specifically, we used ‘HBR_lifespan_36K_79sites’, which makes use of the Hierarchical Bayesian Regression algorithm trained on 37,128 subjects from 79 different collection sites, across the human lifespan.

Note

Please refer to the Group’s Normative Modeling Graphical User Interface (GUI) for more information. For reference to the template Python code used to calculate the deviation scores, please look at the Group’s Braincharts GUI.

Considerations:
Ideally, both adaptation and testing sets would be balanced by age, sex, and site (covariates) following something like a 60/40 or 70/30 split of healthy controls. However, given our limited sample size, we decided to keep all of our healthy control and patient data isolated.

In our analysis: The adaptation set is used to calibrate for site (only healthy controls) while the testing set is used exclusively for patient-phenotyped data. Healthy control phenotypes include ‘A-C-’ (amyloid negative, cognitive impairment negative) & ‘A+C-’ (amyloid positive, cognitive impairment negative).

Patient-phenotypes include ‘A+C+’ (amyloid positive, cognitive impairment positive) & ‘A-C+’ (amyloid negative, cognitive impairment positive). As a consequence of both a smaller sample size & larger site numbers (59 sites), our group elected to utilize the MRI manufacturer (3 total) of the subject’s imaging data to act as a pseudo site variable thus giving more power to viably calibrate for potential “site” influences.
- 1 = GE
- 2 = Philips
- 3 = Siemens

Code Workflow


  1. Data Frame Initialization
Important

Some manual edits may be needed to match your imaging variables to the column headers listed in the normative model CSV template (listed in the Normative Modeling GUI). Specifically, go to Compute here! menu bar > Data type, select ThickAvg > Normative Model, select HBR_lifespan_36K_79sites > wait a moment, and the site will populate the line “Download template csv file here.

  1. Covariate Distributions, Data Splitting, & Balance Verification
Note

Upon completion of this step, you are then able to run your adaptation and testing data sets through the normative model deviation scores computation via either the Normative Modeling GUI or Braincharts GUI if you want to recreate the Python code yourself.

  1. Analytic Data Preparation
  2. Demographic Descriptives and Statistical Analysis of Cortical Thickness
  3. Regional Analyses: Across-Groups & Pairwise Comparisons
  4. ADNI vs PDC Preliminary Analyses
  5. Session Information

Data Frame Initialization


Set up the R environment for neuroimaging data analysis by specifying CRAN and ggseg repositories and managing package installations with pacman. Load essential libraries and set a working directory to access neuroimaging and demographic datasets, which are then read and prepared for analysis, focusing on group cortical thickness comparisons and deviation score analyses. Rename the ‘BioMarkers’ variable to ‘phenotype’ across datasets for consistent terminology.

Tip

Front-loading the reading-in of all datasets being used in comprehensive analyses at the beginning of your script ensures that the data environment is consistently prepared, which simplifies the replication and validation process for other users.

Expand Code
# Set CRAN repository for consistent package installation, ggseg for neuroimaging visualizations
options(repos = c(
  ggseg = 'https://ggseg.r-universe.dev',
  CRAN = 'https://cloud.r-project.org'))

# Install and load the pacman package for efficient package management
if (!require(pacman)) install.packages("pacman")
library(pacman)

# Use p_load function to install (if necessary) and load packages
p_load(dplyr, knitr, kableExtra, psych, tidyr, readr, stringr, matrixStats, ggseg, ggseg3d, 
       ggsegExtra, ggsegDesterieux, cowplot, data.table, e1071, ggplot2, plot.matrix, proxy, 
       RPMG, broom, gridExtra, patchwork, caret, tidyverse, fastDummies, sjPlot, ggbeeswarm, 
       lavaan, caTools, bitops, effects, ggeffects, reshape2, ggpubr)

# Specify the 'base_path' where you can find your data files, ASSUMING they're in the same directory, & set it as WD
base_path <- "~/GitHub/ADNI_NormativeModeling/data_files"
setwd(base_path)

# Read in foundational data set
ADNI_274_Merged <- read_csv("ADNI_274_Merged.csv")

### FreeSurfer variables were edited manually to match PCN template
### Covariates: 'age', 'sex', 'site'
# Read in finalized file, to be used ONLY for group cortical thickness comparisons (not deviation score analyses)
df_merged <- read_csv("ADNI_274_Merged_Final.csv")

# Read in computed deviation scores (the file you get back after uploading adaptation and testing sets)
df <- read_csv("ADNI_deviation_scores_BLR.csv")

# Read in PDC deviation data sets
PDC_HBR <- read_csv("PDC_HBR.csv")
PDC_BLR <- read_csv("PDC_BLR.csv")

# Rename biomarker variable to phenotype
ADNI_274_Merged <- dplyr::rename(ADNI_274_Merged, phenotype = "BioMarkers")
df_merged <- dplyr::rename(df_merged, phenotype = "BioMarkers")
df <- dplyr::rename(df, phenotype = "BioMarkers")
1
For referernce, this merged data file is composed of two data sets: NIDP_DX_Dem_NP_MRI.csv contains clinical/non-imaging variables and MRI manufacturer information, & ADNI_lh_rh_thickness_subcort_volumes.csv contains the FreeSurfer brain thickness measures.
2
See details under the important callout in the Code Workflow section.
3
This should be the spreadsheet that will be split into adaptation and testing sets for normative modeling.
4
This dataset contains subject FreeSurfer variable deviation scores for your TESTING set used in the Normative Modeling GUI/code.

Load and prepare Destrieux atlas data for visualization and analysis. The data from the ggseg package, corrected for the correct spelling of Destrieux, is assigned to variables for easy access and plotted to verify the structure and detail of the atlas features and labeled regions.

Caution

The ggseg package adds an extra ‘e’ in the spelling of the name Destrieux, so it’s NOT a typo.

Expand Code
setwd(base_path)

# Assign atlas data to data frames in local environment
desterieux_dims <- desterieux$data
desterieux <- desterieux

# Plot simple atlas features to test data frame with dimension data
plot(desterieux_dims) +
  theme(legend.position = "bottom",
        legend.text = element_text(size = 7)) +
  guides(fill = guide_legend(ncol = 3))

NULL
Expand Code
# Plot atlas ROIs with labels to test data frame with complete vectors
ggplot() +
  geom_brain(atlas = desterieux)

Covariate Distributions, Data Splitting, & Balance Verification


Visualize and analyze the distribution of MRI manufacturers and models within the dataset after renaming the biomarker variable to phenotype. Bar plots display counts of MRI manufacturers and models grouped by manufacturer, with additional cross-tabulation bar plots showing the number of subjects by phenotype across different sites.

Expand Code
# Create a bar plot for MRI manufacturers with counts displayed
ggplot(ADNI_274_Merged, aes(x = MRI_MANU)) +
  geom_bar(stat = "count", fill = "skyblue", color = "black") +
  geom_text(stat = "count", aes(label = ..count..), vjust = -0.5) +
  labs(title = "Count of MRI Manufacturers", x = "MRI Manufacturer", y = "Count") +
  theme_minimal()

Expand Code
# Create separate bar plots for each manufacturer
ggplot(ADNI_274_Merged, aes(x = MRI_MAKE, fill = MRI_MANU)) +
  geom_bar(position = "dodge") +
  geom_text(aes(label = ..count..), stat = "count", position = position_dodge(width = 0.9), vjust = -0.5) +
  facet_wrap(~MRI_MANU, scales = "free_x") +
  labs(title = "Counts of Each MRI Make Grouped by Manufacturer", x = "MRI Make", y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels for better visibility

Expand Code
# Counting subjects by phenotype within each site
phenotype_site_counts <- df_merged %>%
  group_by(site, phenotype) %>%
  dplyr::summarise(Count = n(), .groups = 'drop')

# Plot with Site on x-axis
plot_site_x <- ggplot(phenotype_site_counts, aes(x = site, y = Count, fill = phenotype)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  geom_text(aes(label = Count), vjust = -0.5, position = position_dodge(width = 0.9)) +
  labs(title = "Subject Counts by Phenotype Across Sites",
       x = "Site",
       y = "Number of Subjects",
       fill = "Phenotype") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 0, hjust = 1))

# Plot with Phenotype on x-axis
plot_phenotype_x <- ggplot(phenotype_site_counts, aes(x = phenotype, y = Count, fill = factor(site))) +
  geom_bar(stat = "identity", position = position_dodge()) +  # Use position_dodge() for proper grouping
  geom_text(aes(label = Count), vjust = -0.5, position = position_dodge(width = 0.9)) +  # Adjust text positioning
  labs(title = "Subject Counts Across Phenotype by Site",
       x = "Phenotype",
       y = "Number of Subjects",
       fill = "Site") +
  scale_fill_brewer(palette = "Set1") +  # Using a qualitative palette for distinct sites
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 0, hjust = 1))  # Improve readability of x-axis labels

# Optionally, print both plots to view them in the R environment
print(plot_site_x)

Expand Code
print(plot_phenotype_x)

Create adaptation and testing datasets from the merged data based on specific biomarker criteria, setting aside the groups for separate analyses. The datasets are saved as CSV files, which will be used for further normative model computations after ensuring that the data are balanced in terms of covariates and grouping variables.

Expand Code
# Separate the data based on 'phenotype'
adaptation_data <- df_merged %>%
  filter(phenotype %in% c("A-C-", "A+C-"))

testing_data <- df_merged %>%
  filter(phenotype %in% c("A-C+", "A+C+"))

# Save the adaptation and testing datasets as CSV files, un-comment when verified
# write_csv(adaptation_data, "ADNI_175_Adaptation.csv")
# write_csv(testing_data, "ADNI_99_Testing.csv")
Note

After verifying that the grouping variable and covariates are balanced as desired (i.e., see the next chunk), you can now run them through the normative model computation via the GUI or recreate their code yourself!

Generate visualizations to assess the balance of biomarkers, site, and sex distributions in adaptation and testing datasets, and calculate the average age for each dataset. These plots and summaries ensure the datasets are appropriately balanced before further normative model computations.

Expand Code
# Plotting the distribution of 'BioMarkers' in both datasets
biomarker_distribution <- bind_rows(
  mutate(adaptation_data, dataset = "Adaptation"),
  mutate(testing_data, dataset = "Testing")
) %>%
  group_by(dataset, phenotype) %>%
  dplyr::summarise(Count = n(), .groups = 'drop') %>%
  group_by(dataset) %>%
  mutate(Percentage = (Count / sum(Count)) * 100)

ggplot(biomarker_distribution, aes(x = phenotype, y = Percentage, fill = dataset)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  labs(title = "Distribution of Biomarker", x = "Biomarker", y = "Percentage (%)") +
  theme_minimal()

Expand Code
# Plotting the distribution of 'site' in both datasets
site_distribution <- bind_rows(
  mutate(adaptation_data, dataset = "Adaptation"),
  mutate(testing_data, dataset = "Testing")
) %>%
  group_by(dataset, site) %>%
  dplyr::summarise(Count = n(), .groups = 'drop') %>%
  group_by(dataset) %>%
  mutate(Percentage = (Count / sum(Count)) * 100)

ggplot(site_distribution, aes(x = site, y = Percentage, fill = dataset)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  labs(title = "Distribution of Site", x = "Site", y = "Percentage (%)") +
  theme_minimal()

Expand Code
# Plotting the distribution of 'sex' in both datasets
sex_distribution <- bind_rows(
  mutate(adaptation_data, dataset = "Adaptation"),
  mutate(testing_data, dataset = "Testing")
) %>%
  group_by(dataset, sex) %>%
  dplyr::summarise(Count = n(), .groups = 'drop') %>%
  group_by(dataset) %>%
  mutate(Percentage = (Count / sum(Count)) * 100)

ggplot(sex_distribution, aes(x = sex, y = Percentage, fill = dataset)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  labs(title = "Distribution of Sex", x = "Sex", y = "Percentage (%)") +
  theme_minimal()

Expand Code
# Calculating and comparing the average age
age_summary <- bind_rows(
  mutate(adaptation_data, dataset = "Adaptation"),
  mutate(testing_data, dataset = "Testing")
) %>%
  group_by(dataset) %>%
  dplyr::summarise(Average_Age = mean(age, na.rm = TRUE), .groups = 'drop')

print(age_summary)
# A tibble: 2 × 2
  dataset    Average_Age
  <chr>            <dbl>
1 Adaptation        70.1
2 Testing           72.1

Analytic Data Preparation


Reorganize and recode key variables in the dataset for group cortical thickness comparisons: set the column order for clarity, recode the ‘phenotype’ variable to categorical levels representing health conditions, and standardize categorical labels for ‘site’ and ‘sex’ to enhance readability and analysis consistency.

Expand Code
# Specifying the order of the first few columns for easier viewing
desired_order <- c("Subject_ID", "age", "sex", "site", "phenotype")

# Append the remaining column names that are not specified in desired_order
remaining_cols <- setdiff(names(df_merged), desired_order)

# Combine the specified order with the remaining columns
new_order <- c(desired_order, remaining_cols)

# Reorder the columns in df according to new_order
df_merged <- df_merged[, new_order]

# Recode phenotype variable
df_merged$phenotype <- recode(df_merged$phenotype,
                              "A-C-" = "HC",
                              "A+C-" = "HC",
                              "A-C+" = "MCI",
                              "A+C+" = "AD")
df_merged$phenotype <- as.factor(df_merged$phenotype)
df_merged$phenotype <- factor(df_merged$phenotype, levels = c("HC", "MCI", "AD")) # explicitly set the reference level

# Recode site variable
df_merged$site <- gsub("1", "GE", df_merged$site)
df_merged$site <- gsub("2", "Philips", df_merged$site)
df_merged$site <- gsub("3", "Siemens", df_merged$site)
df_merged$site <- as.factor(df_merged$site)

# Recode sex variable, 0 = females 1 = males
df_merged$sex <- gsub("0", "Female", df_merged$sex)
df_merged$sex <- gsub("1", "Male", df_merged$sex)
df_merged$sex <- factor(df_merged$sex)

Demographic Descriptives and Statistical Analysis of Cortical Thickness


Generate descriptive statistics and visualize the distribution of key variables such as phenotype, age, and sex within the dataset. Employ cross-tabulation to analyze mean cortical thickness and age by phenotype group, and present these distributions using violin and bar plots. Additionally, compute summary statistics for mean cortical thickness and visually represent them using rain cloud plots to highlight differences across phenotype groups.

Expand Code
# Count of each variable of interest
table(df_merged$phenotype)

 HC MCI  AD 
175  40  59 
Expand Code
table(df_merged$sex)

Female   Male 
   153    121 
Expand Code
table(df_merged$site)

     GE Philips Siemens 
     49      48     177 
Expand Code
# Cross-Tabulation of average thickness and phenotype group
describeBy(df_merged$Mean_Thickness, df_merged$phenotype)

 Descriptive statistics by group 
group: HC
   vars   n mean   sd median trimmed  mad  min  max range  skew kurtosis   se
X1    1 175 2.43 0.08   2.43    2.43 0.07 2.17 2.65  0.48 -0.15     0.12 0.01
------------------------------------------------------------ 
group: MCI
   vars  n mean   sd median trimmed  mad  min  max range skew kurtosis   se
X1    1 40  2.4 0.08   2.41     2.4 0.07 2.24 2.59  0.35 -0.1    -0.28 0.01
------------------------------------------------------------ 
group: AD
   vars  n mean   sd median trimmed mad  min  max range  skew kurtosis   se
X1    1 59 2.35 0.09   2.34    2.35 0.1 2.16 2.55  0.39 -0.08    -0.83 0.01
Expand Code
d<-(describeBy(df_merged$Mean_Thickness, df_merged$phenotype))

# Cross-Tabulation of age and phenotype group
describeBy(df_merged$age, df_merged$phenotype)

 Descriptive statistics by group 
group: HC
   vars   n mean   sd median trimmed  mad min max range skew kurtosis   se
X1    1 175 70.1 5.94     69   69.81 5.93  55  90    35 0.53     0.66 0.45
------------------------------------------------------------ 
group: MCI
   vars  n  mean   sd median trimmed mad min max range  skew kurtosis   se
X1    1 40 71.97 8.81     72   72.16 8.9  56  88    32 -0.22    -0.87 1.39
------------------------------------------------------------ 
group: AD
   vars  n  mean   sd median trimmed  mad min max range  skew kurtosis   se
X1    1 59 72.25 7.73     73   72.43 7.41  55  89    34 -0.19    -0.33 1.01
Expand Code
mean(df_merged$age)
[1] 70.83942
Expand Code
sd(df_merged$age)
[1] 6.871648
Expand Code
# Generate the violin plot of age stratified by phenotype categories
ggplot(df_merged, aes(x = phenotype, y = age, fill = phenotype)) +
  geom_violin() +
  labs(title = "Distribution of Age Stratified by Phenotype Groups",
       x = "Phenotype Group", y = "Age") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 0, hjust = 1))  # Improve readability of x-axis labels

Expand Code
# Cross-Tabulation of sex & phenotype, visualization
ggplot(df_merged, aes(x = phenotype, fill = sex)) +
  geom_bar(position = "dodge") +
  labs(title = "Distribution of Sex Across Phenotype Groups",
       x = "Phenotype Group",
       y = "Count",
       fill = "Sex") +
  theme_minimal()

Expand Code
# Create the summary Mean_Thickness data frame
filtered_MT_summary <- df_merged %>%
  group_by(phenotype) %>%
  dplyr::summarise(
    score_mean = mean(Mean_Thickness, na.rm = TRUE),
    n = n(), # Sample size for each group
    sem = sd(Mean_Thickness, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create the rain cloud plot with separate components
ggplot(df_merged, aes(x = phenotype, y = Mean_Thickness, fill = phenotype, color = phenotype)) +
  PupillometryR::geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5, position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(phenotype)-0.25, y = Mean_Thickness, colour = phenotype), position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25, position = position_dodge(width = 0.3), colour = "black") +
  geom_point(data = filtered_MT_summary, aes(x = factor(phenotype), y = score_mean), shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_MT_summary, aes(x = factor(phenotype), y = score_mean, ymin = score_mean - sem, ymax = score_mean + sem), width = 0.05, position = position_nudge(x = 0.2)) +
  labs(
    title = "Distribution of Mean Cortical Thickness Stratified by Phenotype Groups", 
    y = "Score", 
    x = "Phenotype", 
    fill = "Phenotype Group",  # Legend title for fill
    color = "Phenotype Group"  # Legend title for color
  ) +
  theme_minimal() +
  theme(
    legend.position = "right",
    legend.title = element_text(face = "bold")  # Make legend title bold
  )

Model the impact of phenotype on mean cortical thickness using linear regression, adjusting for covariates such as age and sex. Visualize these relationships and conduct Tukey’s HSD tests to identify significant differences between phenotype groups, presenting results in both tabular form and through a significance plot of the Tukey HSD test outcomes.

Expand Code
# Fit a linear model of Mean_Thickness as a function of phenotype and store it in s_2
s_2 <- lm(Mean_Thickness ~ phenotype, data = df_merged)

# Create a table for the linear model with confidence intervals
tab_model(s_2, title = "Linear Model: Average Cortical Thickness as a Function of Phenotype")
Linear Model: Average Cortical Thickness as a Function of Phenotype
  Mean Thickness
Predictors Estimates CI p
(Intercept) 2.43 2.42 – 2.44 <0.001
phenotype [MCI] -0.03 -0.06 – 0.00 0.079
phenotype [AD] -0.08 -0.10 – -0.05 <0.001
Observations 274
R2 / R2 adjusted 0.126 / 0.119
Expand Code
# Perform Tukey's Honest Significant Difference test and store results
tukey_results_s2 <- TukeyHSD(aov(Mean_Thickness ~ phenotype, data = df_merged))
print(tukey_results_s2)

Tukey multiple comparisons of means 95% family-wise confidence level

Fit: aov(formula = Mean_Thickness ~ phenotype, data = df_merged)

$phenotype diff lwr upr p adj MCI-HC -0.02631591 -0.06143749 0.008805661 0.1831271 AD-HC -0.07969303 -0.10986243 -0.049523633 0.0000000 AD-MCI -0.05337712 -0.09442260 -0.012331635 0.0067581

Expand Code
tukey_data_s2 <- as.data.frame(tukey_results_s2$phenotype)
tukey_data_s2$Comparison <- rownames(tukey_data_s2)
tukey_data_s2$significant <- ifelse(tukey_data_s2$`p adj` < 0.05, "Significant", "Not Significant")

# Creating the plot of the Tukey HSD test with significance indication
ggplot(tukey_data_s2, aes(y = Comparison, xmin = lwr, xmax = upr, x = diff)) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "black") +
  geom_errorbarh(aes(height = 0.2, color = significant)) +
  geom_point(aes(x = diff, color = significant), size = 2) +
  scale_color_manual(values = c("Significant" = "red", "Not Significant" = "blue")) +
  labs(title = "Tukey HSD Test Results for Mean Cortical Thickness by Phenotype",
       x = "Differences in Mean Levels of Phenotype",
       y = "Comparison",
       color = "P-value Significance") +
  theme_minimal()

Expand Code
# Fit a linear model of Mean_Thickness as a function of phenotype with added covariates
s_1<-lm(Mean_Thickness ~ phenotype + age + sex, data = df_merged)

# Create a table for the linear model including covariates with confidence intervals
tab_model(s_1, title = "Enhanced Linear Model: Average Cortical Thickness as a Function of Phenotype, Adjusted for Age and Sex")
Enhanced Linear Model: Average Cortical Thickness as a Function of Phenotype, Adjusted for Age and Sex
  Mean Thickness
Predictors Estimates CI p
(Intercept) 2.58 2.48 – 2.68 <0.001
phenotype [MCI] -0.02 -0.05 – 0.01 0.221
phenotype [AD] -0.07 -0.10 – -0.05 <0.001
age -0.00 -0.00 – -0.00 0.007
sex [Male] -0.02 -0.04 – 0.00 0.119
Observations 274
R2 / R2 adjusted 0.159 / 0.147
Expand Code
# Plot multiple regression model
ggpredict(s_1, c(terms = "age", "phenotype", "sex")) |>
  plot() +
  labs(title = "Average Cortical Thickness as a Function of Phenotype, Adjusted for Age and Sex",
       x = "Age",
       y = "Mean Cortical Thickness (mm)",
       caption = "Note: 'phenotype' factors, 'HC' = Healthy Control; 'MCI' = Mild Cognitive Impairment; & 'AD' = Alzheimer's Disease.")

Regional Analyses: Across-Groups & Pairwise Comparisons


Conduct ANOVA to assess differences across phenotype groups in brain regions’ FreeSurfer volume measures, then apply False Discovery Rate (FDR) corrections to p-values and calculate F-statistics. Finally, visualize significant regions using a heatmap, categorizing results by significance levels, and incorporating annotations for clearer demarcation of significance groups.

Expand Code
# Apply suffix to all ROI FreeSurfer measures
df.rois <- df_merged %>% rename_at(vars((6:153)), ~ paste0(., '_rois'))

# Run ANOVA analyses for each ROI across phenotype group and extract just the p-values
df.stats <- as.data.frame(sapply(X = df.rois[,grep("_rois", names(df.rois),value = T)], FUN = function(x) summary(aov(x ~ df.rois$phenotype))[[1]][["Pr(>F)"]][1]))

# Rename columns in ANOVA output data frame
names(df.stats) <- "p_value"
setDT(df.stats, keep.rownames = "ROI")

# Verify that changes were made via the first 20 ROIs
head(df.stats, 20)
                             ROI      p_value
                          <char>        <num>
 1:      L_G&S_frontomargin_rois 4.069293e-01
 2:     L_G&S_occipital_inf_rois 1.707275e-03
 3:       L_G&S_paracentral_rois 5.474064e-01
 4:        L_G&S_subcentral_rois 2.408362e-02
 5:  L_G&S_transv_frontopol_rois 8.369497e-01
 6:        L_G&S_cingul-Ant_rois 7.697393e-01
 7:    L_G&S_cingul-Mid-Ant_rois 1.248427e-01
 8:   L_G&S_cingul-Mid-Post_rois 4.273325e-02
 9:  L_G_cingul-Post-dorsal_rois 2.454827e-05
10: L_G_cingul-Post-ventral_rois 1.128955e-01
11:              L_G_cuneus_rois 5.383524e-01
12: L_G_front_inf-Opercular_rois 8.558583e-03
13:   L_G_front_inf-Orbital_rois 1.877900e-02
14:  L_G_front_inf-Triangul_rois 1.318903e-01
15:        L_G_front_middle_rois 1.673422e-03
16:           L_G_front_sup_rois 2.681659e-02
17:   L_G_Ins_lg&S_cent_ins_rois 2.452448e-02
18:       L_G_insular_short_rois 5.823087e-02
19:    L_G_occipital_middle_rois 6.015292e-06
20:       L_G_occipital_sup_rois 1.052752e-02
                             ROI      p_value
Expand Code
# Run False Discovery Rate (FDR) corrections 
df.stats <- cbind(df.stats, p.adjust(df.stats$p_value), method = "fdr")

# Rename column of FDR-corrected p-values
names(df.stats)[3] <- "FDR.pvalue"

# Verify that changes were made
head(df.stats, 20)
                             ROI      p_value   FDR.pvalue method
                          <char>        <num>        <num> <char>
 1:      L_G&S_frontomargin_rois 4.069293e-01 1.0000000000    fdr
 2:     L_G&S_occipital_inf_rois 1.707275e-03 0.1519474631    fdr
 3:       L_G&S_paracentral_rois 5.474064e-01 1.0000000000    fdr
 4:        L_G&S_subcentral_rois 2.408362e-02 1.0000000000    fdr
 5:  L_G&S_transv_frontopol_rois 8.369497e-01 1.0000000000    fdr
 6:        L_G&S_cingul-Ant_rois 7.697393e-01 1.0000000000    fdr
 7:    L_G&S_cingul-Mid-Ant_rois 1.248427e-01 1.0000000000    fdr
 8:   L_G&S_cingul-Mid-Post_rois 4.273325e-02 1.0000000000    fdr
 9:  L_G_cingul-Post-dorsal_rois 2.454827e-05 0.0028475988    fdr
10: L_G_cingul-Post-ventral_rois 1.128955e-01 1.0000000000    fdr
11:              L_G_cuneus_rois 5.383524e-01 1.0000000000    fdr
12: L_G_front_inf-Opercular_rois 8.558583e-03 0.6247765493    fdr
13:   L_G_front_inf-Orbital_rois 1.877900e-02 1.0000000000    fdr
14:  L_G_front_inf-Triangul_rois 1.318903e-01 1.0000000000    fdr
15:        L_G_front_middle_rois 1.673422e-03 0.1506079974    fdr
16:           L_G_front_sup_rois 2.681659e-02 1.0000000000    fdr
17:   L_G_Ins_lg&S_cent_ins_rois 2.452448e-02 1.0000000000    fdr
18:       L_G_insular_short_rois 5.823087e-02 1.0000000000    fdr
19:    L_G_occipital_middle_rois 6.015292e-06 0.0007338656    fdr
20:       L_G_occipital_sup_rois 1.052752e-02 0.7369265130    fdr
                             ROI      p_value   FDR.pvalue method
Expand Code
### Repeat steps to now obtain the F-stat values
# Run ANOVA analyses for each ROI across phenotype group and extract just the F-stat values
df.f_stats <- as.data.frame(sapply(X = df.rois[,grep("_rois", names(df.rois),value = T)], FUN = function(x) summary(aov(x ~ df.rois$phenotype))[[1]][["F value"]][1]))

# Rename columns in ANOVA output data frame
names(df.f_stats) <- "f_stat"
setDT(df.f_stats, keep.rownames = "ROI")

# Verify that changes were made via the first 20 ROIs
head(df.f_stats, 20)
                             ROI     f_stat
                          <char>      <num>
 1:      L_G&S_frontomargin_rois  0.9021055
 2:     L_G&S_occipital_inf_rois  6.5250988
 3:       L_G&S_paracentral_rois  0.6039056
 4:        L_G&S_subcentral_rois  3.7779316
 5:  L_G&S_transv_frontopol_rois  0.1781083
 6:        L_G&S_cingul-Ant_rois  0.2619563
 7:    L_G&S_cingul-Mid-Ant_rois  2.0967579
 8:   L_G&S_cingul-Mid-Post_rois  3.1897430
 9:  L_G_cingul-Post-dorsal_rois 11.0417190
10: L_G_cingul-Post-ventral_rois  2.1989441
11:              L_G_cuneus_rois  0.6206591
12: L_G_front_inf-Opercular_rois  4.8454451
13:   L_G_front_inf-Orbital_rois  4.0338957
14:  L_G_front_inf-Triangul_rois  2.0410037
15:        L_G_front_middle_rois  6.5460925
16:           L_G_front_sup_rois  3.6674894
17:   L_G_Ins_lg&S_cent_ins_rois  3.7592872
18:       L_G_insular_short_rois  2.8733819
19:    L_G_occipital_middle_rois 12.5705761
20:       L_G_occipital_sup_rois  4.6311462
                             ROI     f_stat
Expand Code
# List the ROIs that are significant after FDR correction
df.stats[which(df.stats$FDR.pvalue <0.05),]
                             ROI      p_value   FDR.pvalue method
                          <char>        <num>        <num> <char>
 1:  L_G_cingul-Post-dorsal_rois 2.454827e-05 2.847599e-03    fdr
 2:    L_G_occipital_middle_rois 6.015292e-06 7.338656e-04    fdr
 3: L_G_oc-temp_lat-fusifor_rois 4.435059e-04 4.346357e-02    fdr
 4: L_G_oc-temp_med-Parahip_rois 7.044673e-14 1.035567e-11    fdr
 5:  L_G_pariet_inf-Angular_rois 1.808600e-09 2.550126e-07    fdr
 6: L_G_pariet_inf-Supramar_rois 6.562463e-07 8.531201e-05    fdr
 7:        L_G_parietal_sup_rois 6.957297e-07 8.974913e-05    fdr
 8:           L_G_precuneus_rois 2.798773e-05 3.218589e-03    fdr
 9:    L_G_temp_sup-Lateral_rois 3.102014e-08 4.280780e-06    fdr
10: L_G_temp_sup-Plan_polar_rois 2.020428e-11 2.909417e-09    fdr
11: L_G_temp_sup-Plan_tempo_rois 1.294087e-05 1.552905e-03    fdr
12:        L_G_temporal_inf_rois 1.751079e-06 2.206360e-04    fdr
13:     L_G_temporal_middle_rois 4.475073e-12 6.488856e-10    fdr
14:          L_Lat_Fis-post_rois 4.221147e-05 4.601051e-03    fdr
15:         L_Pole_temporal_rois 1.068375e-09 1.527776e-07    fdr
16: L_S_circular_insula_inf_rois 4.064392e-05 4.521522e-03    fdr
17:   L_S_collat_transv_ant_rois 4.739050e-05 5.118174e-03    fdr
18:        L_S_front_middle_rois 2.619400e-04 2.619400e-02    fdr
19:           L_S_front_sup_rois 1.760990e-04 1.813820e-02    fdr
20:  L_S_interm_prim-Jensen_rois 1.361353e-04 1.418534e-02    fdr
21: L_S_intrapariet&P_trans_rois 4.726364e-09 6.616910e-07    fdr
22:  L_S_oc_sup&transversal_rois 1.740234e-07 2.331914e-05    fdr
23:       L_S_occipital_ant_rois 1.874682e-07 2.493327e-05    fdr
24:         L_S_oc-temp_lat_rois 5.478431e-08 7.450666e-06    fdr
25: L_S_oc-temp_med&Lingual_rois 5.016967e-06 6.221040e-04    fdr
26:  L_S_orbital_med-olfact_rois 3.978362e-04 3.938578e-02    fdr
27:   L_S_parieto_occipital_rois 7.095764e-08 9.579282e-06    fdr
28:         L_S_postcentral_rois 4.053353e-05 4.521522e-03    fdr
29:         L_S_subparietal_rois 4.037073e-05 4.521522e-03    fdr
30:        L_S_temporal_inf_rois 2.095310e-08 2.912480e-06    fdr
31:        L_S_temporal_sup_rois 9.850650e-15 1.457896e-12    fdr
32:  R_G_cingul-Post-dorsal_rois 2.505947e-04 2.531007e-02    fdr
33:    R_G_occipital_middle_rois 6.687443e-05 7.155564e-03    fdr
34: R_G_oc-temp_med-Parahip_rois 4.455720e-08 6.104336e-06    fdr
35:  R_G_pariet_inf-Angular_rois 2.163472e-07 2.855784e-05    fdr
36:        R_G_parietal_sup_rois 1.659870e-05 1.958647e-03    fdr
37:           R_G_precuneus_rois 3.389987e-05 3.864585e-03    fdr
38: R_G_temp_sup-Plan_polar_rois 3.534015e-05 3.993437e-03    fdr
39:        R_G_temporal_inf_rois 1.127403e-04 1.195047e-02    fdr
40:     R_G_temporal_middle_rois 1.394761e-09 1.980561e-07    fdr
41:          R_Lat_Fis-post_rois 1.912784e-04 1.951040e-02    fdr
42:         R_Pole_temporal_rois 3.828107e-06 4.785134e-04    fdr
43: R_S_circular_insula_inf_rois 8.345169e-06 1.009765e-03    fdr
44:  R_S_interm_prim-Jensen_rois 5.029451e-06 6.221040e-04    fdr
45: R_S_intrapariet&P_trans_rois 1.245576e-06 1.581881e-04    fdr
46:  R_S_oc_sup&transversal_rois 1.350984e-04 1.418534e-02    fdr
47: R_S_oc-temp_med&Lingual_rois 1.474772e-05 1.754978e-03    fdr
48:   R_S_parieto_occipital_rois 2.161362e-05 2.528793e-03    fdr
49:         R_S_subparietal_rois 2.382738e-07 3.121387e-05    fdr
50:        R_S_temporal_inf_rois 7.332961e-07 9.386190e-05    fdr
51:        R_S_temporal_sup_rois 2.418071e-12 3.530383e-10    fdr
                             ROI      p_value   FDR.pvalue method
Expand Code
# Assign significance levels
df.stats$Significance <- ifelse(df.stats$FDR.pvalue < 0.001, '***',
                               ifelse(df.stats$FDR.pvalue < 0.01, '**',
                               ifelse(df.stats$FDR.pvalue < 0.05, '*', '')))

# Merge data frames for visualization
df.stats_merged <- merge(df.stats, df.f_stats, by = "ROI")

# Melt the data for plotting
df.melted <- melt(df.stats_merged, id.vars = "ROI", variable.name = "Statistic", value.name = "Value")

# Ensure that 'Value' is numeric
df.melted$Value <- as.numeric(as.character(df.melted$Value))

# Ensure df.stats and df.melted are merged to filter and sort
df.melted <- df.melted %>%
  filter(Statistic == "f_stat") %>%
  left_join(df.stats %>% select(ROI, Significance, FDR.pvalue), by = "ROI") %>%
  filter(FDR.pvalue < 0.05) %>%
  arrange(match(Significance, c("", "*", "**", "***")))  # Order by significance levels

# Remove suffix from ROI list
df.melted$ROI <- gsub("_rois", "", df.melted$ROI)

# Define group positions and labels
group_positions <- data.frame(
  start = c(1, 15, 30),  # will need to adjust depending on your own results
  end = c(14, 29, 51),    # will need to adjust depending on your own results
  label = c("*", "**", "***")  # example labels for significance
)

# Create the heatmap
heatmap_plot <- ggplot(df.melted, aes(x = ROI, y = Statistic, fill = Value)) +
  geom_tile() +
  scale_fill_gradient(low = "blue", high = "red") +
  labs(title = "Heatmap of ANOVA Statistics Across ROIs", x = "Region of Interest", y = "F-Statistic") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

# Add custom brackets using annotations
for (i in 1:nrow(group_positions)) {
  heatmap_plot <- heatmap_plot +
    annotate("segment", x = group_positions$start[i], xend = group_positions$end[i], y = Inf, yend = Inf, yjust = 1.5, color = "black", size = 0.5) +
    annotate("text", x = (group_positions$start[i] + group_positions$end[i]) / 2, y = Inf, label = group_positions$label[i], vjust = 2, hjust = 0.5)
}

# Print the cutomized heatmap
print(heatmap_plot)

Adjust brain region names in the data to align with the Destrieux atlas nomenclature for both left and right hemispheres, performing extensive renaming to match the standard naming convention. This ensures the region names are compatible with established neuroimaging atlas formats for subsequent analyses.

Expand Code
# Remove suffix from ROI list
df.stats$ROI <- as.data.frame(gsub("_rois", "", df.stats$ROI))

### Rename ROIs to assign nomenclature consistent with the 'desterieux' atlas package
# Left hemisphere ROIs
left.df.stats <- df.stats %>%
  filter(str_detect(ROI, "L_"))
x <- gsub("L_", "", left.df.stats$ROI)
y <- gsub("G&S_", "G and S ", x)
z <- gsub("G_", "G ", y)
z1 <- gsub("S_", "S ", z)
z2 <- gsub("_", " ", z1)
z3 <- gsub(" bin", "", z2)
z4 <- gsub("\\.", " ", z3)
z5 <- gsub("cingul ", "cingul-", z4)
z6 <- gsub("Mid ", "Mid-", z5)
z7 <- gsub("Post ", "Post-", z6)
z8 <- gsub("inf ", "inf-", z7)
z9 <- gsub("med ", "med-", z8)
z10 <- gsub("sup ", "sup-", z9)
z11 <- gsub("Fis ant ", "Fis-ant-", z10)
z12 <- gsub("precentral ", "precentral-", z11)
z13 <- gsub("Fis pos ", "Fis-pos-", z12)
z14 <- gsub("lg&S", "lg and S", z13)
z15 <- gsub("oc temp", "oc-temp", z14)
z16 <- gsub("sup and transversal", "sup-transversal", z15)
z17 <- gsub("orbital H Shaped", "orbital-H Shaped", z16)
z18 <- gsub("oc sup&transversal", "oc sup and transversal", z17)
z19 <- gsub("prim Jensen", "prim-Jensen", z18)
z20 <- gsub("S oc-temp med&Lingual", "S oc-temp med and Lingual", z19)
z21 <- gsub("lat fusifor", "lat-fusifor", z20)
z22 <- gsub("middle&Lunatus", "middle and Lunatus", z21)
z23 <- gsub("intrapariet&P trans", "intrapariet and P trans", z22)
renamed_ROIs <- gsub("Lat Fis post", "Lat Fis-post", z23)

desterieux_ROIs <- as.data.frame(desterieux_dims %>% filter(hemi == "left"))$region
compare_lists <- cbind(sort(renamed_ROIs), sort(unique(desterieux_ROIs)))
list_matches <- compare_lists[,1] %in% compare_lists[,2]
compare_lists[!list_matches,]
     [,1] [,2]
Expand Code
## if no mismatches, than add to data.frame as region
left.df.stats$region <- renamed_ROIs

# Right hemisphere ROIs
right.df.stats <- df.stats %>%
  filter(str_detect(ROI, "R_"))
x <- gsub("R_", "", right.df.stats$ROI)
y <- gsub("G&S_", "G and S ", x)
z <- gsub("G_", "G ", y)
z1 <- gsub("S_", "S ", z)
z2 <- gsub("_", " ", z1)
z3 <- gsub(" bin", "", z2)
z4 <- gsub("\\.", " ", z3)
z5 <- gsub("cingul ", "cingul-", z4)
z6 <- gsub("Mid ", "Mid-", z5)
z7 <- gsub("Post ", "Post-", z6)
z8 <- gsub("inf ", "inf-", z7)
z9 <- gsub("med ", "med-", z8)
z10 <- gsub("sup ", "sup-", z9)
z11 <- gsub("Fis ant ", "Fis-ant-", z10)
z12 <- gsub("precentral ", "precentral-", z11)
z13 <- gsub("Fis pos ", "Fis-pos-", z12)
z14 <- gsub("lg&S", "lg and S", z13)
z15 <- gsub("oc temp", "oc-temp", z14)
z16 <- gsub("sup and transversal", "sup-transversal", z15)
z17 <- gsub("orbital H Shaped", "orbital-H Shaped", z16)
z18 <- gsub("oc sup&transversal", "oc sup and transversal", z17)
z19 <- gsub("prim Jensen", "prim-Jensen", z18)
z20 <- gsub("S oc-temp med&Lingual", "S oc-temp med and Lingual", z19)
z21 <- gsub("lat fusifor", "lat-fusifor", z20)
z22 <- gsub("middle&Lunatus", "middle and Lunatus", z21)
z23 <- gsub("intrapariet&P trans", "intrapariet and P trans", z22)
renamed_ROIs <- gsub("Lat Fis post", "Lat Fis-post", z23)

desterieux_ROIs <- as.data.frame(desterieux_dims %>% filter(hemi == "right"))$region
compare_lists <- cbind(sort(renamed_ROIs), sort(unique(desterieux_ROIs)))
list_matches <- compare_lists[,1] %in% compare_lists[,2]
compare_lists[!list_matches,]
     [,1] [,2]
Expand Code
## if no mismatches, than add to data.frame as region
right.df.stats$region <- renamed_ROIs

Visualize significant FDR-corrected p-values for brain regions across groups using gradient-filled maps for both left and right hemispheres, highlighted with different colors to represent statistical significance levels. These plots are organized in a grid layout to facilitate comparison between hemispheres.

Expand Code
# Left hemisphere
left_pvalues <- ggseg(.data=left.df.stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "left", colour = "white", size = 0.2) + 
scale_fill_gradientn(limits = c(0,0.05), colours =  rainbow.colors(5))

# Right hemisphere
right_pvalues <- ggseg(.data=right.df.stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "right", colour = "white", size = 0.2) +
  scale_fill_gradientn(limits = c(0,0.05), colours = rainbow.colors(5))

# Add titles to individual plots
left_pvalues <- left_pvalues + labs(title = "Between-Group Comparisons")
right_pvalues <- right_pvalues + labs(title = "Between-Group Comparisons")

cowplot::plot_grid(left_pvalues,right_pvalues, nrow = 2, labels = "AUTO")

Conduct pairwise statistical comparisons between controls and Alzheimer’s disease patients on cortical thickness measures. Perform t-tests on each brain region, followed by FDR corrections to adjust for multiple comparisons, organizing the results into a clean data frame with significance levels marked.

Expand Code
# Data frame preparation
df.CA <- df_merged[grep("HC|AD", df_merged$phenotype), ] # Subset just Controls and Alzheimer's ("CA")

# Perform t-test and extract t-stat value, p-value, and parameters
df.CA_stats <- as.data.frame(sapply(df.CA[6:153], function(x) t.test(x ~ df.CA$phenotype)$statistic))
df.CA_stats1 <- as.data.frame(sapply(df.CA[6:153], function(x) t.test(x ~ df.CA$phenotype)$p.value))
df.CA_stats2 <- as.data.frame(sapply(df.CA[6:153], function(x) t.test(x ~ df.CA$phenotype)$parameter))

# Merge data frames and remove the subsequent, intermediate data frames
df.CA_stats <- cbind(df.CA_stats, df.CA_stats1)
df.CA_stats <- cbind(df.CA_stats, df.CA_stats2)
rm(df.CA_stats1, df.CA_stats2)

# Clean up the t-test data frame
df.CA_stats$t_statistic <- df.CA_stats[2] # Rename column
names(df.CA_stats) <- c('statistic', 'p.value', 'parameter')
df.CA_stats[4] <- NULL

# Perform FDR correction on t-test data frame
df.CA_stats <- cbind(df.CA_stats, p.adjust(df.CA_stats$p.value), method = "fdr")
names(df.CA_stats)[4] <- "FDR.pvalue"
df.CA_stats[5] <- NULL

# Clean up the FDR-corrected data frame
df.CA_stats <- tibble::rownames_to_column(df.CA_stats, "ROI") # Create ROI column
df.CA_stats$ROI <- gsub("\\.t$", "", df.CA_stats$ROI)

# List the ROIs that are significant after FDR correction
df.CA_stats_sig <- df.CA_stats[which(df.CA_stats$FDR.pvalue <0.05),]

Conduct t-tests between control and MCI groups, extracting t-statistics and p-values for brain regions, followed by FDR correction to account for multiple comparisons, leading to a cleaned and organized presentation of significant regions.

Expand Code
# Data frame preparation
df.CM <- df_merged[grep("HC|MCI", df_merged$phenotype), ] # Subset just Controls and MCI ("CM")

# Perform t-test and extract t-stat value, p-value, and parameters
df.CM_stats <- as.data.frame(sapply(df.CM[6:153], function(x) t.test(x ~ df.CM$phenotype)$statistic))
df.CM_stats1 <- as.data.frame(sapply(df.CM[6:153], function(x) t.test(x ~ df.CM$phenotype)$p.value))
df.CM_stats2 <- as.data.frame(sapply(df.CM[6:153], function(x) t.test(x ~ df.CM$phenotype)$parameter))

# Merge data frames and remove the subsequent, intermediate data frames
df.CM_stats <- cbind(df.CM_stats, df.CM_stats1)
df.CM_stats <- cbind(df.CM_stats, df.CM_stats2)
rm(df.CM_stats1, df.CM_stats2)

# Clean up the t-test data frame
df.CM_stats$t_statistic <- df.CM_stats[2] # Rename column
names(df.CM_stats) <- c('statistic', 'p.value', 'parameter')
df.CM_stats[4] <- NULL

# Perform FDR correction on t-test data frame
df.CM_stats <- cbind(df.CM_stats, p.adjust(df.CM_stats$p.value), method = "fdr")
names(df.CM_stats)[4] <- "FDR.pvalue"
df.CM_stats[5] <- NULL

# Clean up the FDR-corrected data frame
df.CM_stats <- tibble::rownames_to_column(df.CM_stats, "ROI") # Create ROI column
df.CM_stats$ROI <- gsub("\\.t$", "", df.CM_stats$ROI)

# List the ROIs that are significant after FDR correction
df.CM_stats_sig <- df.CM_stats[which(df.CM_stats$FDR.pvalue <0.05),]

Perform pairwise t-tests between MCI and Alzheimer’s groups, extracting critical statistics for comparative analysis, apply FDR corrections for multiple testing, and combine significant results to visualize in a lollipop plot, differentiating groups with colors and annotations for significance levels.

Expand Code
# Data frame preparation
df.MA <- df_merged[grep("MCI|AD", df_merged$phenotype), ] # Subset just MCI and Alzheimer's ("MA")

### MCI vs Alzheimer's t-test
# Perform t-test and extract t-stat value, p-value, and parameters
df.MA_stats <- as.data.frame(sapply(df.MA[6:153], function(x) t.test(x ~ df.MA$phenotype)$statistic))
df.MA_stats1 <- as.data.frame(sapply(df.MA[6:153], function(x) t.test(x ~ df.MA$phenotype)$p.value))
df.MA_stats2 <- as.data.frame(sapply(df.MA[6:153], function(x) t.test(x ~ df.MA$phenotype)$parameter))

# Merge data frames and remove the subsequent, intermediate data frames
df.MA_stats <- cbind(df.MA_stats, df.MA_stats1)
df.MA_stats <- cbind(df.MA_stats, df.MA_stats2)
rm(df.MA_stats1, df.MA_stats2)

# Clean up the t-test data frame
df.MA_stats$t_statistic <- df.MA_stats[2] # Rename column
names(df.MA_stats) <- c('statistic', 'p.value', 'parameter')
df.MA_stats[4] <- NULL

# Perform FDR correction on t-test data frame
df.MA_stats <- cbind(df.MA_stats, p.adjust(df.MA_stats$p.value), method = "fdr")
names(df.MA_stats)[4] <- "FDR.pvalue"
df.MA_stats[5] <- NULL

# Clean up the FDR-corrected data frame
df.MA_stats <- tibble::rownames_to_column(df.MA_stats, "ROI") # Create ROI column
df.MA_stats$ROI <- gsub("\\.t$", "", df.MA_stats$ROI)

# List the ROIs that are significant after FDR correction
df.MA_stats_sig <- df.MA_stats[which(df.MA_stats$FDR.pvalue <0.05),]

# Combine the significant results for easy plotting
combined_stats_sig <- bind_rows(
  df.CA_stats_sig %>% mutate(Comparison = "HC vs AD"),
  df.MA_stats_sig %>% mutate(Comparison = "MCI vs AD")
)

# Assign significance levels
combined_stats_sig$Significance <- ifelse(combined_stats_sig$FDR.pvalue < 0.001, '***',
                               ifelse(combined_stats_sig$FDR.pvalue < 0.01, '**',
                               ifelse(combined_stats_sig$FDR.pvalue < 0.05, '*', '')))

# Create the lollipop plot
ggplot(combined_stats_sig, aes(x = reorder(ROI, statistic), y = statistic, color = Comparison)) +
  geom_segment(aes(yend = 0, xend = ROI), size = 1, linetype = "dashed") +
  geom_point(size = 3) +
  labs(
    title = "Significant Differences in Brain Regions Across Comparisons",
    x = "Region of Interest",
    y = "T-Statistic Value",
    color = "Comparison",
    caption = "Note: 'HC' = Healthy Control; 'MCI' = Mild Cognitive Impairment; & 'AD' = Alzheimer's Disease.") +
  theme_minimal() +
  theme(
    legend.position = "top",
    axis.text.x = element_text(angle = 90, hjust = 1, size = 7.5)  # Rotate x labels for better visibility
  )

Define a function to rename region of interest (ROI) labels for brain imaging data to align with the Destrieux atlas nomenclature. Adjust ROI names based on hemisphere, applying systematic replacements to match atlas standards, and then check for mismatches against the atlas database, allowing for application across multiple datasets and hemispheres.

Expand Code
# Rename ROIs to assign nomenclature consistent with the desterieux atlas package
rename_rois <- function(df, hemi, desterieux_dims) {
  # Filter based on hemisphere
  hemi_prefix <- ifelse(hemi == "left", "L_", "R_")
  df_filtered <- df %>%
    filter(str_detect(ROI, hemi_prefix))
  
  # Perform renaming steps
  x <- gsub(paste0(hemi_prefix), "", df_filtered$ROI)
  patterns <- c("G&S_", "G_", "S_", "_", " bin", "\\.", "cingul ", "Mid ", "Post ", 
                "inf ", "med ", "sup ", "Fis ant ", "precentral ", "Fis pos ", "lg&S", 
                "oc temp", "sup and transversal", "orbital H Shaped", "oc sup&transversal", 
                "prim Jensen", "S oc-temp med&Lingual", "lat fusifor", "middle&Lunatus", 
                "intrapariet&P trans", "Lat Fis post")
  replacements <- c("G and S ", "G ", "S ", " ", "", " ", "cingul-", "Mid-", "Post-", 
                    "inf-", "med-", "sup-", "Fis-ant-", "precentral-", "Fis-pos-", 
                    "lg and S", "oc-temp", "sup-transversal", "orbital-H Shaped", 
                    "oc sup and transversal", "prim-Jensen", "S oc-temp med and Lingual", 
                    "lat-fusifor", "middle and Lunatus", "intrapariet and P trans", 
                    "Lat Fis-post")
  for (i in seq_along(patterns)) {
    x <- gsub(patterns[i], replacements[i], x)
  }
  
  # Match with desterieux ROIs
  desterieux_ROIs <- as.data.frame(desterieux_dims %>% filter(hemi == hemi))$region
  compare_lists <- cbind(sort(x), sort(unique(desterieux_ROIs)))
  list_matches <- compare_lists[,1] %in% compare_lists[,2]
  
  if (any(!list_matches)) {
    warning("There are mismatches in ROI names for the ", hemi, " hemisphere.")
  }
  
  df_filtered$region <- x
  return(df_filtered)
}

# Apply the function to each dataset and hemisphere
left.df.CA_stats <- rename_rois(df.CA_stats, "left", desterieux_dims)
right.df.CA_stats <- rename_rois(df.CA_stats, "right", desterieux_dims)
left.df.CM_stats <- rename_rois(df.CM_stats, "left", desterieux_dims)
right.df.CM_stats <- rename_rois(df.CM_stats, "right", desterieux_dims)
left.df.MA_stats <- rename_rois(df.MA_stats, "left", desterieux_dims)
right.df.MA_stats <- rename_rois(df.MA_stats, "right", desterieux_dims)
1
Automated loop version

Visualize the significance of FDR-corrected p-values for pairwise comparisons (Control vs. Alzheimer’s, Control vs. MCI, and MCI vs. Alzheimer’s) across both hemispheres using the Destrieux atlas. Generate separate gradient-filled plots for each comparison and hemisphere, labeling each to reflect the specific groups compared. Display these plots using a grid layout to assess differences effectively.

Expand Code
### Control vs Alzheimer's
# Left hemisphere
left_CA_pvalues <- ggseg(.data=left.df.CA_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "left", colour = "white", size = 0.2) + 
scale_fill_gradientn(limits = c(0,0.05), colours =  rainbow.colors(5))

# Right hemisphere
right_CA_pvalues <- ggseg(.data=right.df.CA_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "right", colour = "white", size = 0.2) +
  scale_fill_gradientn(limits = c(0,0.05), colours = rainbow.colors(5))

### Control vs MCI
# Left hemisphere
left_CM_pvalues <- ggseg(.data=left.df.CM_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "left", colour = "white", size = 0.2) + 
scale_fill_gradientn(limits = c(0,0.05), colours =  rainbow.colors(5))

# Right hemisphere
right_CM_pvalues <- ggseg(.data=right.df.CM_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "right", colour = "white", size = 0.2) +
  scale_fill_gradientn(limits = c(0,0.05), colours = rainbow.colors(5))

### MCI vs Alzheimer's
# Left hemisphere
left_MA_pvalues <- ggseg(.data=left.df.MA_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "left", colour = "white", size = 0.2) + 
scale_fill_gradientn(limits = c(0,0.05), colours =  rainbow.colors(5))

# Right hemisphere
right_MA_pvalues <- ggseg(.data=right.df.MA_stats, atlas = desterieux, mapping=aes(fill=FDR.pvalue), hemisphere = "right", colour = "white", size = 0.2) +
  scale_fill_gradientn(limits = c(0,0.05), colours = rainbow.colors(5))

# Add titles to individual plots
left_CA_pvalues <- left_CA_pvalues + labs(title = "Healthy Controls vs Alzheimer's Group")
right_CA_pvalues <- right_CA_pvalues + labs(title = "Healthy Controls vs Alzheimer's Group")

left_CM_pvalues <- left_CM_pvalues + labs(title = "Healthy Controls vs MCI Group")
right_CM_pvalues <- right_CM_pvalues + labs(title = "Healthy Controls vs MCI Group")

left_MA_pvalues <- left_MA_pvalues + labs(title = "MCI Group vs Alzheimer's Group")
right_MA_pvalues <- right_MA_pvalues + labs(title = "MCI Group vs Alzheimer's Group")

# Print plots to verify that they were correctly constructed
cowplot::plot_grid(left_CA_pvalues, right_CA_pvalues, nrow = 2)

Expand Code
cowplot::plot_grid(left_CM_pvalues, right_CM_pvalues, nrow = 2)

Expand Code
cowplot::plot_grid(left_MA_pvalues, right_MA_pvalues, nrow = 2)

Z-Score Statistics Pipeline


Transform and reorganize analytic variables for comprehensive analyses by converting the ‘phenotype’ column into dummy variables, removing superfluous columns, and systematically renaming and reordering columns for clarity. This process ensures data columns align more effectively with analytical needs, including adjusting categorical variables for models and ensuring ROI names are standardized without duplicates.

Expand Code
# Use dummy_cols to create dummy variables for the 'phenotype' column
df <- dummy_cols(df, select_columns = "phenotype", remove_first_dummy = TRUE, remove_selected_columns = TRUE)

# Rename dummy variable column back to phenotype
df <- dplyr::rename(df, phenotype = "phenotype_A+C+")

# Omit the 'index' column from the data frame, unnecessary variable
df <- df[, -which(names(df) == "index")]

# Specifying the order of the first few columns for easier viewing (and the list of columns to exclude from the ROIs)
desired_order <- c("Subject_ID", "age", "sex", "site", "phenotype")

# Append the remaining column names that are not specified in desired_order
remaining_cols <- setdiff(names(df), desired_order)

# Combine the specified order with the remaining columns
new_order <- c(desired_order, remaining_cols)

# Reorder the columns in df according to new_order
df <- df[, new_order]

# Extract column names, remove those in desired_order, and remove z-score suffixes
ROI <- names(df)[!names(df) %in% desired_order & !grepl("_Z_predict", names(df))]

# Further clean the ROI names to ensure no duplicates from z-score names
ROI <- unique(gsub("_Z_predict", "", ROI))

# Recode phenotype variable
df$phenotype <- gsub("0", "MCI", df$phenotype)
df$phenotype <- gsub("1", "AD", df$phenotype)
df$phenotype <- as.factor(df$phenotype)
df$phenotype <- factor(df$phenotype, levels = c("MCI", "AD")) # explicitly set the reference level

# Recode site variable
df$site <- gsub("1", "GE", df$site)
df$site <- gsub("2", "Philips", df$site)
df$site <- gsub("3", "Siemens", df$site)
df$site <- as.factor(df$site)

# Recode sex variable, 0= females 1= males
df$sex <- gsub("0", "Female", df$sex)
df$sex <- gsub("1", "Male", df$sex)
df$sex <- factor(df$sex)

Identify and quantify outliers in brain region deviation scores by applying a defined statistical threshold, creating a binary representation for outlier detection, and summing these to obtain a total outlier score for each subject. The process includes visualizing these scores by phenotype and performing a basic regression analysis to examine the relationship between phenotypes and outlier scores, with additional summarization of the outlier distributions across phenotypes.

Expand Code
# Establish outlier threshold
outlier_threshold <- -1.96 # bottom 2.5%, as used in the Verdi et al., 2023 paper

# Apply threshold to z-score data to create dummy columns
df3 <- as.data.frame(ifelse(df[,6:153] < outlier_threshold,1,0))

# Rename all binarized columns to have the suffix "_bin"
df3 <- df3 %>% rename_all(paste0, "_bin")

# Sum the dummy column values to create a total outlier score
df$total_outlier_score <- rowSums(df3)

# Merge the two data frames
df <- cbind(df, df3)

# Remove df3 from your R environment as it's no longer needed
rm(df3)

# Create a group difference box plot with colored outlines and filled points
ggplot(df, aes(x = phenotype, y = total_outlier_score, color = phenotype, fill = phenotype)) +
  geom_boxplot(outlier.shape = NA, fill = NA, size = 1) +  # Draw box plots with no fill, only outlines
  geom_jitter(width = 0.2, size = 2, shape = 21) +  # Add jittered points with the same color
  labs(x = "Phenotype", y = "Total Outlier Score") +
  theme_minimal() +
  scale_color_manual(values=c("MCI" ="mediumturquoise", "AD"="lightslateblue")) +  # Color outline by phenotype
  scale_fill_manual(values=c("MCI" ="mediumturquoise", "AD"="lightslateblue")) +  # Fill points by phenotype
  theme(legend.position = "right")

Expand Code
# Calculate descriptives of total outlier score grouped by phenotype group
describeBy(df$total_outlier_score, df$phenotype, IQR = T)

 Descriptive statistics by group 
group: MCI
   vars  n mean   sd median trimmed  mad min max range skew kurtosis   se IQR
X1    1 40  5.3 6.79      2       4 2.97   0  28    28 1.59     1.78 1.07   7
------------------------------------------------------------ 
group: AD
   vars  n  mean    sd median trimmed  mad min max range skew kurtosis   se
X1    1 59 11.83 15.46      6     8.9 7.41   0  66    66 1.85     2.86 2.01
    IQR
X1 12.5
Expand Code
# Basic regression analysis with phenoype as predictor and total outlier score as outcome
s1<- lm(total_outlier_score ~ phenotype, data = df)
tab_model(s1, show.ci = 0.95, title = "Linear Model: Total Outlier Score as a Function of Phenotype")
Linear Model: Total Outlier Score as a Function of Phenotype
  total outlier score
Predictors Estimates CI p
(Intercept) 5.30 1.31 – 9.29 0.010
phenotype [AD] 6.53 1.37 – 11.70 0.014
Observations 99
R2 / R2 adjusted 0.061 / 0.051

ADNI vs PDC Preliminary Analyses


Evaluate correlations between brain region deviation scores derived from hierarchical versus linear Bayesian regression models by merging relevant datasets and calculating Pearson correlation coefficients for predefined brain region subsets. The code handles large data by segmenting regions into smaller subsets to prevent memory overload and applies the Benjamini-Hochberg procedure to correct for multiple comparisons. The results are visualized using scatter and lollipop plots to highlight significant correlations across the brain regions, categorizing significance levels based on adjusted p-values.

Expand Code
# Merge datasets on SubjectID
merged_data <- inner_join(PDC_HBR, PDC_BLR, by = "SubjectID", suffix = c(".HBR", ".BLR"))

# Initialize a list to store all correlation results
all_correlation_results <- list()

# Define subsets of brain regions to avoid buffer overflow
brain_regions_1 <- c("L_G&S_frontomargin", "L_G&S_occipital_inf", "L_G&S_paracentral", 
                   "L_G&S_subcentral", "L_G&S_transv_frontopol", "L_G&S_cingul-Ant",
                   "L_G&S_cingul-Mid-Ant", "L_G&S_cingul-Mid-Post", "L_G_cingul-Post-dorsal",
                   "L_G_cingul-Post-ventral", "L_G_cuneus", "L_G_front_inf-Opercular", 
                   "L_G_front_inf-Orbital", "L_G_front_inf-Triangul", "L_G_front_middle")
brain_regions_2 <- c("L_G_front_sup", "L_G_Ins_lg&S_cent_ins", "L_G_insular_short", 
                   "L_G_occipital_middle", "L_G_occipital_sup", "L_G_oc-temp_lat-fusifor",
                   "L_G_oc-temp_med-Lingual", "L_G_oc-temp_med-Parahip", "L_G_orbital", 
                   "L_G_pariet_inf-Angular", "L_G_pariet_inf-Supramar", "L_G_parietal_sup", 
                   "L_G_postcentral", "L_G_precentral", "L_G_precuneus")
brain_regions_3 <- c("L_G_rectus", "L_G_subcallosal", "L_G_temp_sup-G_T_transv", "L_G_temp_sup-Lateral",
                   "L_G_temp_sup-Plan_polar", "L_G_temp_sup-Plan_tempo", "L_G_temporal_inf",
                   "L_G_temporal_middle", "L_Lat_Fis-ant-Horizont", "L_Lat_Fis-ant-Vertical",
                   "L_Lat_Fis-post", "L_Pole_occipital", "L_Pole_temporal", "L_S_calcarine", 
                   "L_S_central")
brain_regions_4 <- c("L_S_cingul-Marginalis", "L_S_circular_insula_ant",
                   "L_S_circular_insula_inf", "L_S_circular_insula_sup", "L_S_collat_transv_ant", 
                   "L_S_collat_transv_post", "L_S_front_inf", "L_S_front_middle", "L_S_front_sup", 
                   "L_S_interm_prim-Jensen", "L_S_intrapariet&P_trans", "L_S_oc_middle&Lunatus", 
                   "L_S_oc_sup&transversal", "L_S_occipital_ant", "L_S_oc-temp_lat")
brain_regions_5 <- c("L_S_oc-temp_med&Lingual", "L_S_orbital_lateral", "L_S_orbital_med-olfact", 
                   "L_S_orbital-H_Shaped", "L_S_parieto_occipital", "L_S_pericallosal", 
                   "L_S_postcentral", "L_S_precentral-inf-part", "L_S_precentral-sup-part", 
                   "L_S_suborbital", "L_S_subparietal", "L_S_temporal_inf", "L_S_temporal_sup", 
                   "L_S_temporal_transverse", "R_G&S_frontomargin")
brain_regions_6 <- c("R_G&S_occipital_inf", "R_G&S_paracentral", "R_G&S_subcentral", "R_G&S_transv_frontopol", 
                   "R_G&S_cingul-Ant", "R_G&S_cingul-Mid-Ant", "R_G&S_cingul-Mid-Post", 
                   "R_G_cingul-Post-dorsal", "R_G_cingul-Post-ventral", "R_G_cuneus", 
                   "R_G_front_inf-Opercular", "R_G_front_inf-Orbital", "R_G_front_inf-Triangul", 
                   "R_G_front_middle", "R_G_front_sup")
brain_regions_7 <- c("R_G_Ins_lg&S_cent_ins", "R_G_insular_short", "R_G_occipital_middle", "R_G_occipital_sup", 
                   "R_G_oc-temp_lat-fusifor", "R_G_oc-temp_med-Lingual", "R_G_oc-temp_med-Parahip", 
                   "R_G_orbital", "R_G_pariet_inf-Angular", "R_G_pariet_inf-Supramar", 
                   "R_G_parietal_sup", "R_G_postcentral", "R_G_precentral", "R_G_precuneus", 
                   "R_G_rectus")
brain_regions_8 <- c("R_G_subcallosal", "R_G_temp_sup-G_T_transv", "R_G_temp_sup-Lateral",
                   "R_G_temp_sup-Plan_polar", "R_G_temp_sup-Plan_tempo", "R_G_temporal_inf", 
                   "R_G_temporal_middle", "R_Lat_Fis-ant-Horizont", "R_Lat_Fis-ant-Vertical", 
                   "R_Lat_Fis-post", "R_Pole_occipital", "R_Pole_temporal", "R_S_calcarine", 
                   "R_S_central", "R_S_cingul-Marginalis")
brain_regions_9 <- c("R_S_circular_insula_ant", "R_S_circular_insula_inf", "R_S_circular_insula_sup", "R_S_collat_transv_ant", 
                   "R_S_collat_transv_post", "R_S_front_inf", "R_S_front_middle", "R_S_front_sup", 
                   "R_S_interm_prim-Jensen", "R_S_intrapariet&P_trans", "R_S_oc_middle&Lunatus", 
                   "R_S_oc_sup&transversal", "R_S_occipital_ant", "R_S_oc-temp_lat", 
                   "R_S_oc-temp_med&Lingual")
brain_regions_10 <- c("R_S_orbital_lateral", "R_S_orbital_med-olfact", 
                   "R_S_orbital-H_Shaped", "R_S_parieto_occipital", "R_S_pericallosal", 
                   "R_S_postcentral", "R_S_precentral-inf-part", "R_S_precentral-sup-part", 
                   "R_S_suborbital", "R_S_subparietal", "R_S_temporal_inf", "R_S_temporal_sup", 
                   "R_S_temporal_transverse")

# List to store all brain region subsets
brain_regions_list <- list(brain_regions_1, brain_regions_2, brain_regions_3, brain_regions_4, brain_regions_5, brain_regions_6, brain_regions_7, brain_regions_8, brain_regions_9, brain_regions_10)

# Process each subset of brain regions
for (brain_regions in brain_regions_list) {
    correlation_results <- list()
    for(region in brain_regions) {
        region_hbr <- paste(region, ".HBR", sep = "")
        region_blr <- paste(region, ".BLR", sep = "")
        
        # Ensure both variables are present in the dataset
        if(region_hbr %in% names(merged_data) && region_blr %in% names(merged_data)) {
            # Use cor.test to get both correlation and p-value
            test_result <- cor.test(merged_data[[region_hbr]], merged_data[[region_blr]], 
                                    method = "pearson", use = "complete.obs")
            
            # Store results in a list
            correlation_results[[region]] <- list(
                Correlation = test_result$estimate,
                P_Value = test_result$p.value
            )
        }
    }
    # Add subset results to the master list
    all_correlation_results <- c(all_correlation_results, correlation_results)
}

# Convert the list to a data frame for easy viewing
correlation_df <- data.frame(
    Region = names(all_correlation_results),
    Correlation = sapply(all_correlation_results, function(x) x$Correlation),
    P_Value = sapply(all_correlation_results, function(x) x$P_Value),
    row.names = NULL
)

# Apply the Benjamini-Hochberg correction to the p-values
correlation_df$Adjusted_P_Value <- p.adjust(correlation_df$P_Value, method = "BH")

# Print the results
print(correlation_df)
                     Region Correlation       P_Value Adjusted_P_Value
1        L_G&S_frontomargin   0.9441809  6.240770e-53     8.552166e-53
2       L_G&S_occipital_inf   0.8387737  9.485030e-30     9.614962e-30
3         L_G&S_paracentral   0.8879535  1.529169e-37     1.651949e-37
4          L_G&S_subcentral   0.9466024  6.338163e-54     9.107264e-54
5    L_G&S_transv_frontopol   0.9732963  1.423180e-69     3.510510e-69
6          L_G&S_cingul-Ant   0.8900339  5.989882e-38     6.566685e-38
7      L_G&S_cingul-Mid-Ant   0.9657367  6.396050e-64     1.279210e-63
8     L_G&S_cingul-Mid-Post   0.9682624  1.179832e-65     2.567869e-65
9    L_G_cingul-Post-dorsal   0.9745212  1.219494e-70     3.342318e-70
10  L_G_cingul-Post-ventral   0.9874266  9.419893e-87     8.713401e-86
11               L_G_cuneus   0.9739739  3.708802e-70     9.463840e-70
12  L_G_front_inf-Opercular   0.9056285  2.748013e-41     3.152759e-41
13    L_G_front_inf-Orbital   0.9600228  1.956852e-60     3.489327e-60
14   L_G_front_inf-Triangul   0.9049802  3.882216e-41     4.419754e-41
15         L_G_front_middle   0.9454341  1.935779e-53     2.728527e-53
16            L_G_front_sup   0.8948487  6.355426e-39     7.019426e-39
17    L_G_Ins_lg&S_cent_ins   0.9517895  3.226544e-56     4.922974e-56
18        L_G_insular_short   0.9557224  3.931883e-58     6.465764e-58
19     L_G_occipital_middle   0.9371754  2.733551e-50     3.517961e-50
20        L_G_occipital_sup   0.9780833  4.568502e-74     1.469866e-73
21  L_G_oc-temp_lat-fusifor   0.8991228  7.903988e-40     8.862048e-40
22  L_G_oc-temp_med-Lingual   0.9618520  1.714612e-61     3.094666e-61
23  L_G_oc-temp_med-Parahip   0.9429966  1.840812e-52     2.476729e-52
24              L_G_orbital   0.9587014  1.059563e-59     1.802476e-59
25   L_G_pariet_inf-Angular   0.9719775  1.770169e-68     4.093515e-68
26  L_G_pariet_inf-Supramar   0.9898107  1.450111e-91     1.788470e-90
27         L_G_parietal_sup   0.9819521  1.708513e-78     9.030714e-78
28          L_G_postcentral   0.9858249  5.201263e-84     4.051510e-83
29           L_G_precentral   0.9785582  1.448158e-74     4.984356e-74
30            L_G_precuneus   0.9817961  2.684862e-78     1.324532e-77
31               L_G_rectus   0.9892063  3.027240e-90     2.986876e-89
32          L_G_subcallosal   0.9701152  5.106715e-67     1.145142e-66
33  L_G_temp_sup-G_T_transv   0.9832625  3.252148e-80     2.005491e-79
34     L_G_temp_sup-Lateral   0.9763569  2.430148e-72     6.916574e-72
35  L_G_temp_sup-Plan_polar   0.9785819  1.366686e-74     4.967808e-74
36  L_G_temp_sup-Plan_tempo   0.9960224 3.790300e-113    5.609643e-111
37         L_G_temporal_inf   0.9656161  7.680761e-64     1.515670e-63
38      L_G_temporal_middle   0.9739895  3.594727e-70     9.333677e-70
39   L_Lat_Fis-ant-Horizont   0.9787217  9.693283e-75     3.678477e-74
40   L_Lat_Fis-ant-Vertical   0.9918993  8.028713e-97     1.980416e-95
41           L_Lat_Fis-post   0.9934494 1.078707e-101    3.991215e-100
42         L_Pole_occipital   0.9369329  3.331233e-50     4.250194e-50
43          L_Pole_temporal   0.9155653  9.843325e-44     1.138134e-43
44            L_S_calcarine   0.9793646  1.938492e-75     7.549916e-75
45              L_S_central   0.9916187  4.844566e-96     8.962448e-95
46    L_S_cingul-Marginalis   0.9831002  5.400698e-80     3.197213e-79
47  L_S_circular_insula_ant   0.9769093  7.042333e-73     2.084530e-72
48  L_S_circular_insula_inf   0.9916912  3.062216e-96     6.474399e-95
49  L_S_circular_insula_sup   0.9319099  1.694422e-48     2.072516e-48
50    L_S_collat_transv_ant   0.9823310  5.602706e-79     3.071113e-78
51   L_S_collat_transv_post   0.9450398  2.806014e-53     3.917830e-53
52            L_S_front_inf   0.9660544  3.936343e-64     8.205335e-64
53         L_S_front_middle   0.9735392  8.824335e-70     2.213562e-69
54            L_S_front_sup   0.9646310  3.345863e-63     6.348560e-63
55   L_S_interm_prim-Jensen   0.9785678  1.414519e-74     4.984356e-74
56  L_S_intrapariet&P_trans   0.9779550  6.204550e-74     1.953773e-73
57    L_S_oc_middle&Lunatus   0.9222308  1.503869e-45     1.766450e-45
58   L_S_oc_sup&transversal   0.9621160  1.194726e-61     2.182955e-61
59        L_S_occipital_ant   0.9527094  1.190559e-56     1.854765e-56
60          L_S_oc-temp_lat   0.9814668  6.885094e-78     3.184356e-77
61  L_S_oc-temp_med&Lingual   0.9670135  8.833665e-65     1.867689e-64
62      L_S_orbital_lateral   0.8700354  2.444591e-34     2.547884e-34
63   L_S_orbital_med-olfact   0.9785790  1.376217e-74     4.967808e-74
64     L_S_orbital-H_Shaped   0.9707998  1.521968e-67     3.465404e-67
65    L_S_parieto_occipital   0.9844438  6.930263e-82     4.884186e-81
66         L_S_pericallosal   0.9477710  2.022773e-54     2.964063e-54
67          L_S_postcentral   0.9781059  4.327602e-74     1.423300e-73
68  L_S_precentral-inf-part   0.9695648  1.324280e-66     2.925276e-66
69  L_S_precentral-sup-part   0.8710740  1.643642e-34     1.725242e-34
70           L_S_suborbital   0.9908692  4.451610e-94     7.320426e-93
71          L_S_subparietal   0.9839788  3.261221e-81     2.193912e-80
72         L_S_temporal_inf   0.9635707  1.557520e-62     2.917885e-62
73         L_S_temporal_sup   0.9725570  5.937132e-69     1.440485e-68
74  L_S_temporal_transverse   0.9775971  1.443155e-73     4.358918e-73
75       R_G&S_frontomargin   0.8891843  8.802778e-38     9.579494e-38
76      R_G&S_occipital_inf   0.9308154  3.833131e-48     4.612222e-48
77        R_G&S_paracentral   0.8987629  9.454142e-40     1.052040e-39
78         R_G&S_subcentral   0.9741369  2.670095e-70     7.056679e-70
79   R_G&S_transv_frontopol   0.9413515  7.963813e-52     1.052361e-51
80         R_G&S_cingul-Ant   0.8802103  4.282435e-36     4.592757e-36
81     R_G&S_cingul-Mid-Ant   0.9818162  2.533406e-78     1.292911e-77
82    R_G&S_cingul-Mid-Post   0.9439760  7.537901e-53     1.023495e-52
83   R_G_cingul-Post-dorsal   0.9893147  1.777174e-90     1.878727e-89
84  R_G_cingul-Post-ventral   0.9723510  8.777566e-69     2.062031e-68
85               R_G_cuneus   0.9524624  1.559069e-56     2.403565e-56
86  R_G_front_inf-Opercular   0.9598653  2.400206e-60     4.228934e-60
87    R_G_front_inf-Orbital   0.9872547  1.926039e-86     1.676787e-85
88   R_G_front_inf-Triangul   0.9310460  3.230962e-48     3.919528e-48
89         R_G_front_middle   0.9484056  1.075851e-54     1.608343e-54
90            R_G_front_sup   0.8595594  1.117939e-32     1.157028e-32
91    R_G_Ins_lg&S_cent_ins   0.9262081  1.034250e-46     1.224552e-46
92        R_G_insular_short   0.9658190  5.642899e-64     1.144040e-63
93     R_G_occipital_middle   0.8790151  7.016927e-36     7.471260e-36
94        R_G_occipital_sup   0.9530118  8.541252e-57     1.344793e-56
95  R_G_oc-temp_lat-fusifor   0.9397683  3.134632e-51     4.105535e-51
96  R_G_oc-temp_med-Lingual   0.8320543  6.855260e-29     6.855260e-29
97  R_G_oc-temp_med-Parahip   0.9781895  3.541331e-74     1.191175e-73
98              R_G_orbital   0.9461587  9.713781e-54     1.382346e-53
99   R_G_pariet_inf-Angular   0.9900036  5.289795e-92     7.117179e-91
100 R_G_pariet_inf-Supramar   0.9810918  1.971430e-77     8.841564e-77
101        R_G_parietal_sup   0.9532546  6.531787e-57     1.039467e-56
102         R_G_postcentral   0.9659177  4.853196e-64     9.976014e-64
103          R_G_precentral   0.9829091  9.752930e-80     5.551668e-79
104           R_G_precuneus   0.9893391  1.575822e-90     1.794013e-89
105              R_G_rectus   0.9670953  7.760994e-65     1.664677e-64
106         R_G_subcallosal   0.9427736  2.250722e-52     3.000963e-52
107 R_G_temp_sup-G_T_transv   0.9765764  1.490887e-72     4.326496e-72
108    R_G_temp_sup-Lateral   0.9744812  1.323875e-70     3.562427e-70
109 R_G_temp_sup-Plan_polar   0.9364426  4.957588e-50     6.271137e-50
110 R_G_temp_sup-Plan_tempo   0.9940436 7.093169e-104    3.499297e-102
111        R_G_temporal_inf   0.9805210  9.397794e-77     3.973924e-76
112     R_G_temporal_middle   0.9794203  1.682196e-75     6.728784e-75
113  R_Lat_Fis-ant-Horizont   0.9809646  2.803181e-77     1.220208e-76
114  R_Lat_Fis-ant-Vertical   0.9382120  1.163111e-50     1.510004e-50
115          R_Lat_Fis-post   0.9902001  1.856328e-92     2.747365e-91
116        R_Pole_occipital   0.9867998  1.221581e-85     1.004411e-84
117         R_Pole_temporal   0.9030309  1.081088e-40     1.221382e-40
118           R_S_calcarine   0.9850460  8.681774e-83     6.424513e-82
119             R_S_central   0.9359165  7.568053e-50     9.492134e-50
120   R_S_cingul-Marginalis   0.9626403  5.786859e-62     1.070569e-61
121 R_S_circular_insula_ant   0.9595257  3.718551e-60     6.474653e-60
122 R_S_circular_insula_inf   0.9801970  2.233655e-76     9.182804e-76
123 R_S_circular_insula_sup   0.9347096  1.970814e-49     2.451097e-49
124   R_S_collat_transv_ant   0.9946453 2.548284e-106    1.885730e-104
125  R_S_collat_transv_post   0.9286010  1.920165e-47     2.291809e-47
126           R_S_front_inf   0.9554797  5.219772e-58     8.489300e-58
127        R_S_front_middle   0.8714062  1.446581e-34     1.529243e-34
128           R_S_front_sup   0.9174505  3.127269e-44     3.644377e-44
129  R_S_interm_prim-Jensen   0.9816800  3.749314e-78     1.789995e-77
130 R_S_intrapariet&P_trans   0.9580532  2.378127e-59     3.999577e-59
131   R_S_oc_middle&Lunatus   0.8476294  6.065221e-31     6.190708e-31
132  R_S_oc_sup&transversal   0.9472525  3.368279e-54     4.887307e-54
133       R_S_occipital_ant   0.9837397  7.107497e-81     4.573520e-80
134         R_S_oc-temp_lat   0.9563044  1.979707e-58     3.292097e-58
135 R_S_oc-temp_med&Lingual   0.9779272  6.627961e-74     2.043621e-73
136     R_S_orbital_lateral   0.8498671  2.944711e-31     3.026508e-31
137  R_S_orbital_med-olfact   0.9320631  1.509935e-48     1.862254e-48
138    R_S_orbital-H_Shaped   0.8323132  6.362609e-29     6.405892e-29
139   R_S_parieto_occipital   0.9933773 1.923856e-101    5.694615e-100
140        R_S_pericallosal   0.9759544  5.884867e-72     1.643321e-71
141         R_S_postcentral   0.9482349  1.275976e-54     1.888445e-54
142 R_S_precentral-inf-part   0.9651018  1.664798e-63     3.199872e-63
143 R_S_precentral-sup-part   0.9445751  4.331069e-53     5.990638e-53
144          R_S_suborbital   0.9651162  1.629301e-63     3.172849e-63
145         R_S_subparietal   0.9723543  8.723141e-69     2.062031e-68
146        R_S_temporal_inf   0.9489402  6.281677e-55     9.486614e-55
147        R_S_temporal_sup   0.9594341  4.181644e-60     7.196318e-60
148 R_S_temporal_transverse   0.9542736  2.085742e-57     3.355324e-57
Expand Code
# Convert the data to a suitable format for ggplot
correlation_df$Region <- factor(correlation_df$Region, levels = correlation_df$Region)

# Create a scatter plot for each region showing the correlation values
# Assuming 'merged_data' contains corresponding columns for HBR and BLR scores
p <- ggplot(correlation_df, aes(x = Region, y = Correlation)) +
  geom_point(stat = "identity") +
  ylim(0.8, 1) +
  labs(title = "Correlation of Deviation Scores Across Brain Regions",
       x = "Brain Region", y = "Correlation Coefficient") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 5.5))

# Print the plot
print(p)

Expand Code
# Adding a new column for significance level based on p-values
correlation_df$Significance <- ifelse(correlation_df$Adjusted_P_Value < 0.001, '***',
                                      ifelse(correlation_df$Adjusted_P_Value < 0.01, '**',
                                      ifelse(correlation_df$Adjusted_P_Value < 0.05, '*', 'ns')))

# Create the lollipop plot
ggplot(correlation_df, aes(x = reorder(Region, Correlation), y = Correlation, color = Significance)) +
    geom_segment(aes(xend = Region, yend = 0.8), size = 0.25) +  # Create lines from 0 to the correlation value
    geom_point(size = 1) +  # Add points for each correlation value
    scale_color_manual(values = c('***' = 'red', '**' = 'orange', '*' = 'blue', 'ns' = 'grey')) +  # Assign colors based on significance
    labs(
        title = "Correlation and Significance of Brain Regions: Hierarchical vs. Linear Bayesian Regression",
        x = "Brain Region",
        y = "Correlation Coefficient",
        color = "Significance Level"
    ) +
    theme_minimal() +
    theme(
        legend.position = "top",  # Move the legend to the top of the plot
        axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 5.5),  # Improve readability of x-axis labels
        legend.title = element_text(face = "bold")  # Bold the legend title for emphasis
    )

Session Information


To enhance reproducibility, details about the working environment used for these analyses can be found below.

Expand Code
sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggpubr_0.6.0              reshape2_1.4.4           
 [3] ggeffects_1.7.0           effects_4.2-2            
 [5] carData_3.0-5             bitops_1.0-7             
 [7] caTools_1.18.2            lavaan_0.6-17            
 [9] ggbeeswarm_0.7.2          sjPlot_2.8.16            
[11] fastDummies_1.7.3         lubridate_1.9.3          
[13] forcats_1.0.0             purrr_1.0.2              
[15] tibble_3.2.1              tidyverse_2.0.0          
[17] caret_6.0-94              lattice_0.22-6           
[19] patchwork_1.2.0           gridExtra_2.3            
[21] broom_1.0.6               RPMG_2.2-7               
[23] proxy_0.4-27              plot.matrix_1.6.2        
[25] ggplot2_3.5.1             e1071_1.7-14             
[27] data.table_1.15.4         cowplot_1.1.3            
[29] ggsegDesterieux_1.0.1.002 ggsegExtra_1.5.33.004    
[31] ggseg3d_1.6.3             ggseg_1.6.6              
[33] matrixStats_1.3.0         stringr_1.5.1            
[35] readr_2.1.5               tidyr_1.3.1              
[37] psych_2.4.6.26            kableExtra_1.4.0         
[39] knitr_1.48                dplyr_1.1.4              
[41] pacman_0.5.1             

loaded via a namespace (and not attached):
  [1] splines_4.4.0        R.oo_1.26.0          datawizard_0.11.0   
  [4] hardhat_1.3.1        pROC_1.18.5          rpart_4.1.23        
  [7] lifecycle_1.0.4      rstatix_0.7.2        sf_1.0-16           
 [10] pbmcapply_1.5.1      vroom_1.6.5          globals_0.16.3      
 [13] MASS_7.3-60.2        insight_0.20.1       survey_4.4-2        
 [16] backports_1.5.0      magrittr_2.0.3       plotly_4.10.4       
 [19] rmarkdown_2.27       yaml_2.3.9           rgdal_1.6-7         
 [22] sp_2.1-4             RColorBrewer_1.1-3   minqa_1.2.7         
 [25] DBI_1.2.3            abind_1.4-5          quadprog_1.5-8      
 [28] R.utils_2.12.3       rgl_1.3.1            nnet_7.3-19         
 [31] ipred_0.9-14         lava_1.8.0           listenv_0.9.1       
 [34] terra_1.7-78         units_0.8-5          performance_0.12.0  
 [37] parallelly_1.37.1    svglite_2.1.3        codetools_0.2-20    
 [40] xml2_1.3.6           neurobase_1.32.4     tidyselect_1.2.1    
 [43] raster_3.6-26        farver_2.1.2         effectsize_0.8.9    
 [46] lme4_1.1-35.5        stats4_4.4.0         base64enc_0.1-3     
 [49] PupillometryR_0.0.5  jsonlite_1.8.8       survival_3.5-8      
 [52] iterators_1.0.14     systemfonts_1.1.0    RNifti_1.7.0        
 [55] smoothr_1.0.1        foreach_1.5.2        tools_4.4.0         
 [58] Rcpp_1.0.12          glue_1.7.0           mnormt_2.1.1        
 [61] prodlim_2023.08.28   Rttf2pt1_1.3.12      xfun_0.45           
 [64] withr_3.0.0          fastmap_1.2.0        mitools_2.4         
 [67] boot_1.3-30          fansi_1.0.6          digest_0.6.36       
 [70] timechange_0.3.0     R6_2.5.1             colorspace_2.1-0    
 [73] freesurfer_1.6.10    jpeg_0.1-10          R.methodsS3_1.8.2   
 [76] utf8_1.2.4           generics_0.1.3       recipes_1.0.10      
 [79] class_7.3-22         httr_1.4.7           htmlwidgets_1.6.4   
 [82] parameters_0.22.0    ModelMetrics_1.2.2.2 pkgconfig_2.0.3     
 [85] gtable_0.3.5         timeDate_4032.109    htmltools_0.5.8.1   
 [88] scales_1.3.0         snakecase_0.11.1     gower_1.0.1         
 [91] rstudioapi_0.16.0    tzdb_0.4.0           oro.nifti_0.11.4    
 [94] nloptr_2.1.1         nlme_3.1-164         sjlabelled_1.2.0    
 [97] KernSmooth_2.23-22   parallel_4.4.0       vipor_0.4.7         
[100] extrafont_0.19       pillar_1.9.0         grid_4.4.0          
[103] vctrs_0.6.5          car_3.1-2            extrafontdb_1.0     
[106] beeswarm_0.4.0       evaluate_0.24.0      pbivnorm_0.6.0      
[109] magick_2.8.4         cli_3.6.3            compiler_4.4.0      
[112] crayon_1.5.3         rlang_1.1.4          ggsignif_0.6.4      
[115] future.apply_1.11.2  labeling_0.4.3       classInt_0.4-10     
[118] plyr_1.8.9           sjmisc_2.8.10        stringi_1.8.4       
[121] viridisLite_0.4.2    stars_0.6-6          munsell_0.5.1       
[124] lazyeval_0.2.2       bayestestR_0.13.2    Matrix_1.7-0        
[127] sjstats_0.19.0       hms_1.1.3            bit64_4.0.5         
[130] future_1.33.2        haven_2.5.4          geomorph_4.0.8      
[133] bit_4.0.5            ape_5.8              RRPP_2.0.3